Our objective is to develop methodology for analyzing life test data. Initially,
we have only data and no mathematical models. Through an exploratory data
analysis or an analysis based on the physical processes generating the data,
we may judge an exponential life distribution model as appropriate for the
analysis of the data. Specifically:

(0.1)  $F(x \mid \lambda) = 1 - \exp[-\lambda x], \qquad x \ge 0,\ \lambda > 0,$

where $\lambda$ is the unknown constant failure rate. The vertical bar in $F(x \mid \lambda)$ indicates
that we are conditioning on the parameter $\lambda$; i.e., for specified $\lambda$ the distribution
is exponential with failure rate $\lambda$. The corresponding density is

(0.2)  $f(x \mid \lambda) = \lambda \exp[-\lambda x], \qquad x \ge 0,\ \lambda > 0.$


1.  -  Basic  concepts. 

To begin with, we suppose the life test data consist of observed complete
lifetimes $x_1, x_2, \ldots, x_n$ on $n$ units. For example, table 1.1 lists lifelengths ordered
by rank of 100 Kevlar 49/Epoxy strands subjected to a high static load [1].
An exploratory data analysis indicates that an exponential life distribution
may be appropriate for analyzing these lifelengths. Thus we assume that the
observations constitute a sample of $n$ independent, identically distributed
random variables with distribution $F$ given by (0.1). Although we assume a
fixed $\lambda$ exists which specifies $F$, we are uncertain as to the true value of $\lambda$ and
seek a method which uses the data to express probabilistically our uncertainty
regarding the true value of $\lambda$.
Table 1.1. - Times to failure of strands subjected to stress at 80 % of mean rupture
strength.

Rank        Hours  Rank        Hours  Rank        Hours  Rank        Hours
   1          1.8    26         84.2    51        152.2    76        285.9
   2          3.1    27         87.1    52        152.8    77        292.6
   3  (illegible)    28         87.3    53        157.7    78        295.1
   4          6.0    29         93.2    54        160.0    79        301.1
   5          7.5    30        103.4    55        163.6    80        304.3
   6          8.2    31        104.6    56        166.9    81        316.8
   7          8.5    32        105.5    57        170.5    82        329.8
   8         10.3    33        108.8    58        174.9    83        334.1
   9         10.6    34        112.6    59        177.7    84        346.2
  10         24.2    35        116.8    60        179.2    85        351.2
  11         29.6    36        118.0    61        183.6    86        353.3
  12         31.7    37        122.3    62        183.8    87        369.3
  13         41.9    38        123.5    63        194.3    88        372.3
  14         44.1    39        124.4    64        195.1    89        381.3
  15         49.5    40        125.4    65        195.3    90        393.5
  16         50.1    41        129.5    66        202.6    91        451.3
  17         59.7    42        130.4    67        220.2    92        461.5
  18         61.7    43        131.6    68        221.3    93        574.2
  19         64.4    44        132.8    69    (missing)    94        653.3
  20         69.7    45        133.8    70        251.0    95        663.0
  21         70.0    46        137.0    71        266.5    96        669.8
  22         77.8    47        140.2    72        267.9    97        739.7
  23         80.5    48        140.9    73        269.2    98        759.6
  24         82.3    49        148.5    74        270.4    99        894.7
  25         83.5    50        149.2    75    (missing)   100        974.9

The first step is to evaluate the joint probability density of the random
lifetimes $X_1, \ldots, X_n$, evaluated at the observed values $x_1, \ldots, x_n$. Since we
are assuming that $X_1, \ldots, X_n$ are independent given $\lambda$, the joint density of
the observed values is

(1.1)  $\prod_{i=1}^{n} f(x_i \mid \lambda) = \lambda^{n} \exp\left[-\lambda \sum_{i=1}^{n} x_i\right].$


1.1. The likelihood function. - To focus attention on the parameter of
interest $\lambda$, we regard (1.1) as a function of $\lambda$ and call

(1.2)  $L(\lambda \mid x_1, \ldots, x_n) = \lambda^{n} \exp\left[-\lambda \sum_{i=1}^{n} x_i\right]$

the likelihood function. (The likelihood, although a function of the parameter $\lambda$,
is not a probability density in the parameter. Hence the vertical bar in $L$ is
used to indicate that the data to the right of the vertical bar are given.) The
likelihood function provides a means of quantifying the information contained
in the data concerning the unknown true value of the exponential parameter $\lambda$.

Suppose a unique value $\hat{\lambda}$ of $\lambda$ exists maximizing the likelihood function;
then we call $\hat{\lambda}$ the mode of $L(\lambda \mid x_1, \ldots, x_n)$ and the maximum-likelihood estimator
(MLE) of $\lambda$. In general, the MLE, when it exists, is a very useful concept.

To simplify the calculation of $\hat{\lambda}$, we use the fact that the maximum of the
likelihood, when it exists, is achieved at the same value of $\lambda$ as is the maximum
of the logarithm of the likelihood. Thus we compute

$\frac{d}{d\lambda} \ln L(\lambda \mid x_1, \ldots, x_n) = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i$

and set the derivative equal to 0. We readily obtain

$\hat{\lambda} = n \Big/ \sum_{i=1}^{n} x_i$

and verify that $\hat{\lambda}$ maximizes $L(\lambda \mid x_1, \ldots, x_n)$ for fixed $x_1, \ldots, x_n$.
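The calculation of the MLE can be sketched in a few lines of Python. This is only an illustration: the function names and the five lifetimes are hypothetical and are not taken from table 1.1. The sketch evaluates the log of the likelihood (1.2) and checks that the closed-form value $n / \sum x_i$ does better than nearby values of the failure rate.

```python
import math

def exp_log_likelihood(lam, lifetimes):
    """Log of the likelihood (1.2): n*ln(lam) - lam*sum(x_i)."""
    n = len(lifetimes)
    return n * math.log(lam) - lam * sum(lifetimes)

def exp_mle(lifetimes):
    """Closed-form maximizer of the likelihood: n / sum(x_i)."""
    return len(lifetimes) / sum(lifetimes)

# Hypothetical complete lifetimes in hours (illustrative only).
sample = [12.0, 30.5, 7.2, 55.1, 20.2]
lam_hat = exp_mle(sample)

# The log-likelihood at lam_hat should exceed its value at nearby rates.
at_mle = exp_log_likelihood(lam_hat, sample)
nearby = [exp_log_likelihood(f * lam_hat, sample) for f in (0.8, 0.9, 1.1, 1.2)]
is_maximum = all(at_mle > v for v in nearby)
```

Working with the logarithm avoids the underflow that $\lambda^{n} \exp[-\lambda \sum x_i]$ would produce for large samples.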

The MLE $\hat{\lambda}$ may be a very satisfactory estimator of the unknown failure
rate $\lambda$ for moderate to large sample sizes $n$. (Caution: For more complex life
distribution models involving an infinite number of unknown parameters, the
MLE of the parameters may be quite misleading. See [2] for an example in
which the MLE converges to the wrong set of the parameter values even though
the sample size tends to infinity. Also see [3], p. 12, for a similar two-parameter
example and [4], p. 34, for a one-parameter example.) In the present case of
estimation of the single parameter $\lambda$ of the exponential, our uncertainty as to
the true value of $\lambda$ stems from the fact that our sample size $n$ is finite.

We express our uncertainty concerning $\lambda$ by means of a probability
distribution for $\lambda$. To display explicitly this point of view, we let $\tilde{\lambda}$ denote a random
variable expressing our uncertainty concerning the unknown true value of $\lambda$.

Bayes' theorem. A key theorem based on this point of view is the fundamental
Bayes' theorem. It provides a method for computing the probability
density of the random variable expressing our uncertainty concerning the
parameter conditioned on the observed data.

1.2. Theorem (Bayes' theorem). - Let a) $X$ and $\theta$ be random variables with
joint probability density $p(x, \theta)$, b) $p(x \mid \theta)$ and $p(\theta \mid x)$ denote the corresponding
conditional densities, and c) $\pi(\theta)$ denote the marginal density of $\theta$. Let $\Theta$
be the parameter space, i.e. $\theta \in \Theta$. Then

(1.3)  $p(\theta \mid x) = \frac{p(x \mid \theta)\,\pi(\theta)}{\int_{\Theta} p(x \mid \theta)\,\pi(\theta)\,d\theta}.$

Proof. The joint density $p(x, \theta)$ of $X$ and $\theta$ may be written as

$p(x, \theta) = p(x \mid \theta)\,\pi(\theta).$

By definition of a conditional probability density,

$p(\theta \mid x) = p(x, \theta)/p(x),$

when $p(x) > 0$, where

$p(x) = \int_{\Theta} p(x \mid \theta)\,\pi(\theta)\,d\theta.$

By combining the three equalities in the steps just above, we immediately
obtain the desired conclusion (1.3). ||

Prior  and  posterior  distributions.  Before  analyzing  statistical  data,  it  is 
helpful  and  efficient  to  assess  prior  knowledge.  A  convenient  way  to  accomplish 
this  is  to  formulate  a  probability  density  on  the  parameter(s)  of  the  model 
selected.  Once  we  select  an  appropriate  model,  and  a  prior  distribution  on  the 
parameter  space  for  that  model,  we  may  complete  a  useful  and  informative 
data  analysis  in  an  unambiguous  fashion  using  only  the  standard  calculus  of 
probability  theory. 

The prior density. First, we confine our choice of prior densities to proper
densities. A density $\pi(\cdot)$ is proper if $\int \pi(\lambda)\,d\lambda$ exists and equals one.

Next, to motivate the concept of a natural conjugate prior for $\tilde{\lambda}$, we suppose
that, in the particular problem under discussion, we have very little prior
information concerning $\tilde{\lambda}$. It seems natural to assume initially a rectangular
prior density:

$\pi_0(\lambda) = \begin{cases} M^{-1} & \text{for } 0 \le \lambda \le M, \\ 0 & \text{otherwise,} \end{cases}$

where $M$ is a very large number (say $M = 10^{10}$). Under this assumption on the
prior density, we assign the same probability that $\tilde{\lambda}$ is in any interval in $[0, M]$
of a specified length. For example, the a priori probability that $5 < \tilde{\lambda} < 10$
is the same as the a priori probability that $19 < \tilde{\lambda} < 24$.

The posterior density. Suppose we have observed a sample of $n$ lifelengths
$x_1, \ldots, x_n$ having joint density $p(x_1, \ldots, x_n \mid \lambda)$. By Bayes' theorem, the posterior
density of $\tilde{\lambda}$ based on $x_1, \ldots, x_n$ is given by

$\pi_1(\lambda \mid x_1, \ldots, x_n) = \frac{p(x_1, \ldots, x_n \mid \lambda)\,\pi_0(\lambda)}{\int_0^{\infty} p(x_1, \ldots, x_n \mid \lambda)\,\pi_0(\lambda)\,d\lambda}.$

Recall that the likelihood $L(\lambda \mid x_1, \ldots, x_n) = p(x_1, \ldots, x_n \mid \lambda)$, namely the
probability density of the observed outcome considered as a function of the
parameter $\lambda$. Thus we may write

$\pi_1(\lambda \mid x_1, \ldots, x_n) = \frac{L(\lambda \mid x_1, \ldots, x_n)\,\pi_0(\lambda)}{\int_0^{\infty} L(\lambda \mid x_1, \ldots, x_n)\,\pi_0(\lambda)\,d\lambda}.$

Since $\lambda$ has been integrated out in the denominator, the denominator is now a
constant with respect to $\lambda$. Hence

$\pi_1(\lambda \mid x_1, \ldots, x_n) \propto L(\lambda \mid x_1, \ldots, x_n)\,\pi_0(\lambda),$

where $\propto$ means « proportional to ». Thus the right-hand side is the same as
the left-hand side up to a constant which does not depend on the parameter $\lambda$.
Notice that, since the data $x_1, \ldots, x_n$ have already been observed, the data
are not considered variables at this stage of the analysis.

Assuming the rectangular prior $\pi_0(\lambda) = M^{-1}$ for $0 \le \lambda \le M$, we obtain for
the posterior density of $\tilde{\lambda}$

$\pi_1(\lambda \mid x_1, \ldots, x_n) = \lambda^{n} \exp\left[-\lambda \sum_{i=1}^{n} x_i\right] \Big/ \int_0^{M} \lambda^{n} \exp\left[-\lambda \sum_{i=1}^{n} x_i\right] d\lambda.$

For $n^{-1} \sum_{i=1}^{n} x_i \gg M^{-1}$, this is approximately

(1.4)  $\pi_1(\lambda \mid x_1, \ldots, x_n) \doteq \left(\sum_{i=1}^{n} x_i\right)^{n+1} \lambda^{n} \exp\left[-\lambda \sum_{i=1}^{n} x_i\right] \Big/ \Gamma(n+1),$

where $\Gamma(n+1) = \int_0^{\infty} u^{n} \exp[-u]\,du$ is the gamma function (equal to $n!$ for integer $n$).
In computing (1.4) we have used the fact that
$c^{n+1} \int_0^{\infty} u^{n} \exp[-cu]\,du = \Gamma(n+1)$ for all $c > 0$.

Thus, if we assume initially a rectangular prior on $[0, M]$, $M$ large, the
resulting posterior density is approximately of the form

(1.5)  $b^{a} \lambda^{a-1} \exp[-b\lambda]/\Gamma(a) \qquad \text{for } \lambda \ge 0,$

where $a, b > 0$. This is a gamma density; here the shape parameter is
$a = n + 1$ and the scale parameter is

$b = \sum_{i=1}^{n} x_i.$
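The approximation (1.4) can be checked numerically. The sketch below is an illustration under stated assumptions: the five lifetimes, the bound $M = 2$ and the grid size are hypothetical choices, and the normalizing integral over $[0, M]$ is replaced by a midpoint Riemann sum. It compares the normalized flat-prior posterior with the gamma density (1.5) with $a = n + 1$ and $b = \sum x_i$.

```python
import math

def gamma_pdf(lam, a, b):
    """Gamma density b^a lam^(a-1) exp(-b*lam) / Gamma(a), as in (1.5)."""
    return math.exp(a * math.log(b) + (a - 1) * math.log(lam)
                    - b * lam - math.lgamma(a))

def flat_prior_posterior(lifetimes, M, steps=20_000):
    """Normalize lam^n exp(-lam*sum(x)) over (0, M) with a midpoint sum."""
    n, s = len(lifetimes), sum(lifetimes)
    h = M / steps
    grid = [(i + 0.5) * h for i in range(steps)]
    unnorm = [math.exp(n * math.log(lam) - lam * s) for lam in grid]
    z = sum(unnorm) * h                      # approximates the integral over [0, M]
    return grid, [u / z for u in unnorm]

# Hypothetical lifetimes (hours); M is large compared with n / sum(x) = 0.04.
sample = [12.0, 30.5, 7.2, 55.1, 20.2]
grid, post = flat_prior_posterior(sample, M=2.0)
a, b = len(sample) + 1, sum(sample)

# Largest relative gap between the grid posterior and the gamma density
# over the region that carries most of the posterior mass.
rel_err = max(abs(p - gamma_pdf(lam, a, b)) / gamma_pdf(lam, a, b)
              for lam, p in zip(grid, post) if 0.01 < lam < 0.15)
```

With $M$ this large relative to $\hat{\lambda}$, the truncation of the integral at $M$ is negligible and the two densities agree to within the quadrature error.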

Now, suppose we obtain an additional independent random sample of
lifelengths $y_1, y_2, \ldots, y_m$. Then it is reasonable to use as our new prior density
the posterior density (1.4) obtained from the previous sample. Using as our
new prior

$\pi_1(\lambda) = b^{a} \lambda^{a-1} \exp[-b\lambda]/\Gamma(a)$

with $a = n + 1$ and $b = \sum_{i=1}^{n} x_i$, we obtain

$\pi_2(\lambda \mid y_1, \ldots, y_m) = \frac{\lambda^{m} \exp\left[-\lambda \sum_{i=1}^{m} y_i\right] b^{a} \lambda^{a-1} \exp[-b\lambda]/\Gamma(a)}{\int_0^{\infty} \left\{\lambda^{m} \exp\left[-\lambda \sum_{i=1}^{m} y_i\right] b^{a} \lambda^{a-1} \exp[-b\lambda]/\Gamma(a)\right\} d\lambda}$

or

(1.6)  $\pi_2(\lambda \mid y_1, \ldots, y_m) = \left(b + \sum_{i=1}^{m} y_i\right)^{m+a} \lambda^{m+a-1} \exp\left[-\lambda\left(b + \sum_{i=1}^{m} y_i\right)\right] \Big/ \Gamma(m + a).$

Thus $a$ is increased by the additional number $m$ of observed failures and $b$
is increased by the additional quantity $\sum_{i=1}^{m} y_i$ to obtain the new posterior density
$\pi_2$; note, however, that the form of the posterior density, the gamma, is retained.

Because of this preservation property (the gamma prior used in the
exponential model leads to a gamma posterior), the gamma prior is called the
« natural conjugate » prior for the parameter $\lambda$ in the exponential model. More
generally, a family of prior distributions is « conjugate » with respect to a given
statistical model if the form of the posterior in each case is the same as that of
the prior and it is the minimal such family; parameters of the posterior
distributions will, of course, change in accordance with the data observed.

In the present case, we can interpret the prior density parameter $a - 1$
(if $a$ is an integer $> 1$) as the number of observations in a previous experiment
(actual or conceptual) and $b$ as the corresponding total time on test.

In the present exponential model, the gamma prior for $\tilde{\lambda}$ is mathematically
convenient and has intuitive interpretation in terms of an equivalent sample.
However, the analyst is not confined to a choice within this family. Rather,
the choice of the prior distribution should always reflect the best possible
specification of the analyst's prior information concerning the unknown
parameter. Thus, in reporting the results of the data analysis, the analyst should
present his specification of the prior and the basis for his choice.
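The conjugate updating described in (1.6) amounts to simple bookkeeping on the pair $(a, b)$. The following sketch, with a hypothetical helper `update_gamma` and made-up lifetimes, shows that updating on a second sample gives the same gamma parameters as processing both samples together under the flat-prior approximation.

```python
def update_gamma(a, b, lifetimes):
    """Conjugate update for complete exponential data, as in (1.6):
    shape a grows by the number of failures, b by their total lifetime."""
    return a + len(lifetimes), b + sum(lifetimes)

first = [12.0, 30.5, 7.2]    # hypothetical first sample x_1, ..., x_n
second = [55.1, 20.2]        # hypothetical second sample y_1, ..., y_m

# Flat-prior posterior after the first sample: a = n + 1, b = sum(x), as in (1.4).
a1, b1 = len(first) + 1, sum(first)

# Use that posterior as the prior for the second sample.
a2, b2 = update_gamma(a1, b1, second)

# Processing both samples at once leads to the same parameters.
a_all, b_all = len(first + second) + 1, sum(first + second)
```

Only the counts and totals are carried forward, which is the equivalent-sample interpretation of $a - 1$ and $b$.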

1.3. Example. - Table 1.1 lists the observed lifetimes of 100 organic fiber
strands subjected to a static load of 3009 g which corresponds to 80% of their
mean rupture strength. Experience has shown that the lifetime of an organic
fiber strand at relatively high stress can be reasonably well fitted by an
exponential life distribution. Thus we assume

$P[\text{lifelength} > x] = \exp[-\lambda x] \qquad \text{for } x \ge 0,$

where $\lambda$ is unknown. As described above, we calculated the MLE of $\lambda$ as

$\hat{\lambda} = 4.78 \times 10^{-3}/\text{hour}.$

Since the sample size of 100 is moderately large, the likelihood

$\lambda^{100} \exp\left[-\lambda \sum_{i=1}^{100} x_i\right]$

will override in importance the rectangular prior

$\pi_0(\lambda) = M^{-1}, \qquad 0 \le \lambda \le M,$

when $M \gg \hat{\lambda}$. From the prior $\pi_0$ we calculate the posterior density of $\tilde{\lambda}$ to be
approximately a gamma with parameters $a = 101$ and $b = 20{,}917$ hours.
(Note that $a$ is dimensionless while $b$ is measured in hours.) See fig. 1.1. The
mode of the posterior density may be used to estimate $\lambda$.
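The figures in this example can be reproduced from the two summary numbers alone. The sketch below assumes that the quoted value of $b$ is read as 20,917 hours, a reading consistent with the quoted MLE; the variable names are ours. It uses the standard gamma formulas mode $= (a-1)/b$, mean $= a/b$ and coefficient of variation $= a^{-1/2}$.

```python
import math

n = 100                  # number of strands in table 1.1
total_hours = 20917.0    # total of the observed lifetimes (assumed reading of b)

lam_mle = n / total_hours            # MLE: n / sum(x_i)
a, b = n + 1, total_hours            # flat-prior gamma posterior parameters
post_mode = (a - 1) / b              # mode of the gamma posterior
post_mean = a / b                    # mean of the gamma posterior
post_cv = 1 / math.sqrt(a)           # standard deviation divided by mean

print(f"MLE  = {lam_mle:.3e} per hour")
print(f"mode = {post_mode:.3e} per hour")
print(f"mean = {post_mean:.3e} per hour, CV = {post_cv:.3f}")
```

Under the flat prior the posterior mode coincides with the MLE, while the posterior mean is slightly larger because the shape parameter is $n + 1$ rather than $n$.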

Fig. 1.1. - Computer plot of posterior gamma density of lambda.

Sufficient statistics. By a statistic, we mean a function (possibly
vector-valued) of the data. A statistic, of course, is often used to estimate an unknown
parameter of interest. Clearly, the complete set of data observed is trivially
a statistic. For a large set of data, working with all of the observations may be
tedious or even unmanageable. Thus we are motivated to find a statistic of
smaller dimensionality like the sample mean (of dimension one) or the number
of failures and the total time on test (of dimension two), but which contains
all of the information in the sample concerning the parameter. Preferably,
we would like to estimate the parameter using a statistic of lower dimensionality
which summarizes all of the information in the sample. We will often use $D$
to denote the observed data. For example, $D$ could stand for the vector of
observed values $(x_1, \ldots, x_n)$. Later it will denote more complicated data
sets. This motivates

1.4. Definition. - Let $D$ denote the data with probability density $p(D \mid \theta)$
indexed by the parameter $\theta$. A statistic $s$ is sufficient for $\theta$ if and only if, for
every prior $\pi(\theta)$, the posterior

$\pi(\theta \mid D) = p(D \mid \theta)\,\pi(\theta) \Big/ \int p(D \mid \theta)\,\pi(\theta)\,d\theta$

depends on the data $D$ only through the statistic $s$; i.e., for every prior $\pi$, the
posterior can be written as $\pi(\theta \mid s)$.

Intuitively, knowing $s$ we are as informed about the parameter $\theta$ as when
we know all the data collected.

There may be several sufficient statistics available for estimating a
parameter. Clearly, from among these we would prefer to make use of one of lowest
dimensionality. For a given parameter, there actually may be more than one
sufficient statistic of lowest dimensionality and it may be vector valued.

An easy way to find a sufficient statistic is to examine the likelihood for the
kind of factorization displayed in

1.5. Lemma. - Let the likelihood $L(\theta \mid D)$ factor so that

$L(\theta \mid D) = g(s \mid \theta)\,h(D),$

where $h$ does not depend on $\theta$. Then $s$ is a sufficient statistic for $\theta$.

Proof. For an arbitrary prior density $\pi$, the corresponding posterior density
is given by

$\pi(\theta \mid D) = L(\theta \mid D)\,\pi(\theta) \Big/ \int L(\theta \mid D)\,\pi(\theta)\,d\theta$

$\qquad = g(s \mid \theta)\,h(D)\,\pi(\theta) \Big/ \int g(s \mid \theta)\,h(D)\,\pi(\theta)\,d\theta = g(s \mid \theta)\,\pi(\theta) \Big/ \int g(s \mid \theta)\,\pi(\theta)\,d\theta.$

The last expression clearly depends on $D$ only through the statistic $s$. Thus,
by definition 1.4, $s$ is sufficient for $\theta$. ||

For example, if $x_1, \ldots, x_n$ are independent lifelengths in the exponential
model, given $\lambda$, then $n$ and $s = \sum_{i=1}^{n} x_i$ together form a sufficient statistic for the failure rate $\lambda$
since

$L(\lambda \mid x_1, \ldots, x_n) = \lambda^{n} \exp\left[-\lambda \sum_{i=1}^{n} x_i\right] = \lambda^{n} \exp[-\lambda s].$
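The sufficiency of $(n, s)$ can be illustrated numerically: two data sets with the same number of observations and the same total give the same likelihood at every value of the failure rate, and hence the same posterior for any prior. The data sets and the function name below are hypothetical.

```python
import math

def log_likelihood(lam, lifetimes):
    """Log-likelihood of complete exponential data: depends only on n and s."""
    n, s = len(lifetimes), sum(lifetimes)
    return n * math.log(lam) - lam * s

data_a = [1.0, 2.0, 9.0]    # hypothetical sample, n = 3, s = 12
data_b = [4.0, 4.0, 4.0]    # different values, same n and same s

same_likelihood = all(
    math.isclose(log_likelihood(lam, data_a), log_likelihood(lam, data_b))
    for lam in (0.05, 0.25, 1.0, 3.0)
)

# A sample with the same n but a different total is distinguishable.
data_c = [1.0, 2.0, 3.0]
differs = not math.isclose(log_likelihood(0.25, data_a),
                           log_likelihood(0.25, data_c))
```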

The sample space. The sample space is the space or set of possible sample
outcomes. If we observe the lifetimes of $n$ units, the sample space is

(1.7)  $S = \{(x_1, x_2, \ldots, x_n) \mid x_i \ge 0,\ 1 \le i \le n\}.$

For $n = 2$, $S$ is the positive quadrant (see fig. 1.2).

Fig. 1.2.

However, we may just as well consider another sample space. Suppose
we are told only the ordered lifetimes

$x_{(1)} \le x_{(2)} \le \ldots \le x_{(n)},$

i.e. we no longer know which unit fails at time $x_{(i)}$, $1 \le i \le n$. The sample space
corresponding to the ordered lifetimes is now

(1.8)  $S_0 = \{(x_{(1)}, x_{(2)}, \ldots, x_{(n)}) \mid 0 \le x_{(1)} \le \ldots \le x_{(n)}\}.$

For $n = 2$, we now have fig. 1.3.

Fig. 1.3.


For sample space (1.7) and the exponential model we have the joint
probability density

(1.9)  $p(x_1, x_2, \ldots, x_n \mid \lambda) = \begin{cases} \lambda^{n} \exp\left[-\lambda \sum_{i=1}^{n} x_i\right], & x_i \ge 0,\ 1 \le i \le n, \\ 0, & \text{otherwise.} \end{cases}$

For this case $L(\lambda \mid x_1, x_2, \ldots, x_n) = p(x_1, x_2, \ldots, x_n \mid \lambda)$.

On the other hand, for sample space (1.8) and the exponential model we have

(1.10)  $p_0(x_{(1)}, \ldots, x_{(n)} \mid \lambda) = \begin{cases} n!\,\lambda^{n} \exp\left[-\lambda \sum_{i=1}^{n} x_{(i)}\right], & 0 \le x_{(1)} \le \ldots \le x_{(n)}, \\ 0, & \text{otherwise.} \end{cases}$

The factor $n!$ in (1.10) follows from the fact that the ordered observations can
result from any one of $n!$ permutations of the observations $x_1, x_2, \ldots, x_n$. For
this case

$L_0(\lambda \mid x_{(1)}, \ldots, x_{(n)}) = p_0(x_{(1)}, \ldots, x_{(n)} \mid \lambda),$

where $L_0$ is the likelihood.

From (1.9) and (1.10) we see that

(1.11)  $L(\lambda \mid x_1, x_2, \ldots, x_n) \propto L_0(\lambda \mid x_{(1)}, x_{(2)}, \ldots, x_{(n)}).$

It follows that, for any prior for $\tilde{\lambda}$, the posterior density for $\tilde{\lambda}$ will be the same
no matter which of the above sample spaces we choose. From (1.11) and
lemma 1.5 we see that the order statistics $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ are sufficient for $\lambda$.


2.  -  Selected  life  test  sampling  plans. 

In  this  section  we  illustrate  the  application  of  the  concepts  and  methods 
of  sect.  1  when  estimating  under  each  of  several  commonly  used  life  test 
sampling  plans. 
Sampling plan (a). Complete observation until a specified number of failures
have occurred. A popular plan consists of putting $n$ items on test and observing
the failure times of the first $k$ failures, where $k$ is specified in advance. The
motivation for following this plan is to save time in determining an estimate
of the exponential failure rate.

Let

$x_{(1)} \le x_{(2)} \le \ldots \le x_{(k)}, \qquad 1 \le k \le n,$

denote the successive times of failure of the earliest $k$ failures; $x_{(1)}, \ldots, x_{(k)}$ are
called the first $k$ order statistics in a sample of size $n$. The likelihood of this
observed outcome under the exponential model is given by

$L(\lambda \mid D) = \frac{n!}{1! \cdots 1!\,(n-k)!} \left[\prod_{i=1}^{k} \lambda \exp[-\lambda x_{(i)}]\right] \exp[-\lambda (n-k) x_{(k)}].$

(We follow the usual convention that $0! = 1$.)

Simplifying, we have

(2.1)  $L(\lambda \mid D) = \frac{n!}{(n-k)!}\,\lambda^{k} \exp\left[-\lambda \left[\sum_{i=1}^{k} x_{(i)} + (n-k) x_{(k)}\right]\right].$

To verify the expression preceding (2.1), note that the combinatorial
coefficient $n!/1! \cdots 1!\,(n-k)!$ represents the number of ways of choosing one
observation to correspond to each of $x_{(1)}, x_{(2)}, \ldots, x_{(k)}$ and $n-k$ observations
for the $n-k$ unobserved failure times, from among the $n$ failure times
(of which only the first $k$ are actually observed). The product factor represents
the joint density of the $k$ actually observed failure times, given $\lambda$. Finally,
the last factor represents the probability that $n-k$ lifelengths each exceed
$x_{(k)}$, given $\lambda$.

The expression in the exponent of (2.1)

(2.2)  $\sum_{i=1}^{k} x_{(i)} + (n-k) x_{(k)} = n x_{(1)} + (n-1)(x_{(2)} - x_{(1)}) + \ldots + (n-k+1)(x_{(k)} - x_{(k-1)}) = T$

represents the total time on test until the $k$-th failure. Note that it is comprised
of $n x_{(1)}$, the total time on test observed until the first failure, of $(n-1)(x_{(2)} - x_{(1)})$,
the total time on test observed between the first and second failures, ..., and
of $(n-k+1)(x_{(k)} - x_{(k-1)})$, the total time on test observed between the
penultimate and the last observed failures. (Of course, it is understood that, after
an item fails, it is no longer under observation.)
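The two ways of writing the total time on test in (2.2) can be compared directly. In the sketch below the function names, the three failure times and the number of units on test are hypothetical; the first function sums the observed failure times and adds the time accumulated by the survivors, the second accumulates the time between successive failures weighted by the number of units still on test.

```python
def ttt_direct(failures, n):
    """Sum of the first k order statistics plus (n - k) * x_(k)."""
    x = sorted(failures)
    k = len(x)
    return sum(x) + (n - k) * x[-1]

def ttt_by_intervals(failures, n):
    """n*x_(1) + (n-1)*(x_(2) - x_(1)) + ... + (n-k+1)*(x_(k) - x_(k-1))."""
    x = sorted(failures)
    total, previous = 0.0, 0.0
    for i, t in enumerate(x):
        # Before the (i+1)-th failure, n - i units are still on test.
        total += (n - i) * (t - previous)
        previous = t
    return total

first_failures = [3.0, 5.0, 11.0]   # hypothetical first k = 3 failure times
n_units = 10                        # hypothetical number of units on test

T_direct = ttt_direct(first_failures, n_units)
T_intervals = ttt_by_intervals(first_failures, n_units)
```

For these numbers both forms give $3 + 5 + 11 + 7 \cdot 11 = 96$ unit-hours, and $k/T$ is then the MLE of the failure rate.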

The total time on test statistic turns out to be a very important and useful
statistic not only in the exponential model, but also, after appropriate
generalization, in a large number of other models involving incomplete data. In
the exponential model, the total time on test and the number of observed
failures constitute a sufficient statistic for $\lambda$. In the presently considered sampling
plan under which $k$, the number of observed failures, is specified in advance,
we see from (2.1) that $(k, T)$ is sufficient for $\lambda$.

Suppose we assume a gamma prior on $\lambda$

$\pi(\lambda) = b^{a} \lambda^{a-1} \exp[-b\lambda]/\Gamma(a).$

Using (2.1), we obtain for the posterior

(2.3)  $\pi(\lambda \mid D) = [b + T]^{k+a} \lambda^{k+a-1} \exp[-\lambda[b + T]]/\Gamma(k + a),$

also a gamma, but with the shape parameter $a$ of the prior density replaced
by $a + k$ and the scale parameter $b$ of the prior density replaced by $b + T(x_{(k)})$.
Note that the increment in the scale parameter is $T(x_{(k)})$, the observed total
time on test. The mode of the posterior density is

$\lambda_0 = (k + a - 1)/[b + T].$

It is interesting to note that, for $\pi_0(\lambda) = c$, the mode of the posterior is exactly
the well-known MLE:

$\hat{\lambda} = k/T.$

However, it should be emphasized that this prior is improper in the sense that

$\int_0^{\infty} \pi_0(\lambda)\,d\lambda = \infty.$

We would expect that, after collecting a set of data from the exponential
distribution, we would have more information concerning the unknown
parameter $\lambda$ than before; more precisely, the peakedness of the density of $\tilde{\lambda}$ might
increase or the coefficient of variation might decrease. The coefficient of
variation of a distribution is the ratio of the standard deviation (assumed finite)
to the absolute value of the mean (assumed nonzero). In the case of the gamma
prior density, the mean is $a/b$, the variance is $a/b^{2}$, and so the corresponding
coefficient of variation is $a^{-1/2}$.

Under the present sampling plan, the posterior distribution, given in (2.3),
is also a gamma distribution, but with the shape parameter (the updated « a »
value) now $a + k$. It follows that the coefficient of variation is now reduced
to $(a + k)^{-1/2}$. Thus, for fixed $a$, the coefficient of variation decreases roughly
as $k^{-1/2}$, where $k$ denotes the observed number of failures. This simple
calculation gives us a quantitative notion as to our relative uncertainty concerning $\lambda$
both before and after observation and, therefore, concerning the decrease in
our uncertainty.
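The update from a gamma prior to the gamma posterior, together with the posterior mode and the coefficient of variation, can be sketched as follows. The prior parameters and the data summary $(k, T)$ are hypothetical, and the helper names are ours. The last lines treat the constant prior formally as the limiting case $a = 1$, $b = 0$, for which the posterior mode reduces to $k/T$.

```python
import math

def gamma_posterior(a, b, k, T):
    """Gamma(a, b) prior with k failures and total time on test T
    gives a Gamma(a + k, b + T) posterior."""
    return a + k, b + T

def gamma_mode(a, b):
    """Mode (a - 1) / b of a gamma density with shape a >= 1."""
    return (a - 1) / b

def gamma_cv(a):
    """Coefficient of variation of a gamma density: a ** (-1/2)."""
    return 1 / math.sqrt(a)

a0, b0 = 2.0, 50.0      # hypothetical prior parameters
k, T = 3, 96.0          # hypothetical number of failures and total time on test

a1, b1 = gamma_posterior(a0, b0, k, T)
mode = gamma_mode(a1, b1)                    # (k + a - 1) / (b + T)
cv_prior, cv_post = gamma_cv(a0), gamma_cv(a1)

# Constant (improper) prior, formally a = 1 and b = 0: the mode is the MLE k / T.
a_flat, b_flat = gamma_posterior(1.0, 0.0, k, T)
mode_flat = gamma_mode(a_flat, b_flat)
```

The coefficient of variation depends on the shape parameter only, so it falls from $a^{-1/2}$ to $(a + k)^{-1/2}$ regardless of the observed total time on test.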
Sampling plan (b). Observation terminated at fixed time (truncated sampling).
Suppose $n$ units are put on life test at time $t = 0$ and each is observed until
failure or fixed time $t_0$, whichever occurs first. Given the random outcome
$K = k$ ($0 \le k \le n$) observed failures in $[0, t_0]$ and the times of failures, the
corresponding likelihood is given by

(2.4)  $L(\lambda \mid D) = \frac{n!}{1! \cdots 1!\,(n-k)!} \left[\prod_{i=1}^{k} \lambda \exp[-\lambda x_{(i)}]\right] \exp[-\lambda (n-k) t_0],$

where the product in square brackets is defined as 1 for $k = 0$. The verification
of the likelihood expression (2.4) is similar to that obtained under the previous
sampling plan, leading to the expression preceding (2.1). One key difference
is that the number $n - k$ surviving fixed time $t_0$ under the present plan is random,
while the number $n - k$ surviving past the $k$-th failure under the earlier plan
is specified in advance.

We may rewrite the likelihood $L(\lambda \mid D)$ as

(2.5)  $L(\lambda \mid D) = \frac{n!}{(n-k)!}\,\lambda^{k} \exp\left[-\lambda \left[\sum_{i=1}^{k} x_{(i)} + (n-k) t_0\right]\right].$

It is clear from (2.5) and lemma 1.5 that the pair $k$ and the total time on test

$T = \sum_{i=1}^{k} x_{(i)} + (n-k) t_0$

are sufficient for $\lambda$. Under the present sampling plan, both $k$ and $T(t_0)$ are
observed, and thus together constitute the data $D$.

From (2.5) the MLE is computed as

$\hat{\lambda} = k/T(t_0).$

Under the present sampling plan it is possible to observe $k = 0$ failures, leading
to a MLE of 0. Such an estimate is intuitively unsatisfactory; in this situation
an analysis based on the posterior distribution of $\tilde{\lambda}$ is preferable.
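The case of no observed failures is easy to examine in a short sketch. The function names, the number of units, the truncation time and the gamma prior parameters below are all hypothetical. With $k = 0$ the MLE is zero, whereas the mean $(a + k)/(b + T)$ of the gamma posterior remains strictly positive.

```python
def truncated_test_summary(failure_times, n, t0):
    """Return (k, T) for n units each observed until failure or time t0."""
    observed = [t for t in failure_times if t <= t0]
    k = len(observed)
    T = sum(observed) + (n - k) * t0     # survivors each contribute t0
    return k, T

def posterior_mean(a, b, k, T):
    """Mean (a + k) / (b + T) of the gamma posterior for the failure rate."""
    return (a + k) / (b + T)

# Hypothetical test: 10 units, truncation at 100 hours, no failures seen.
k, T = truncated_test_summary([], n=10, t0=100.0)
mle = k / T                                   # equals 0 when k = 0
mean = posterior_mean(2.0, 50.0, k, T)        # hypothetical gamma(2, 50) prior

# The same plan with two observed failures, for comparison.
k2, T2 = truncated_test_summary([20.0, 70.0], n=10, t0=100.0)
```

How far the posterior mean sits above zero in the no-failure case is governed by the prior, so the choice of $a$ and $b$ deserves the justification asked of the analyst in sect. 1.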

The posterior density resulting from a gamma prior (see (1.4)) is, using (2.5),

(2.6)  $\pi(\lambda \mid D) = [b + T]^{k+a} \lambda^{k+a-1} \exp[-\lambda[b + T]]/\Gamma(k + a).$

Just as in the previous sampling plan, the posterior coefficient of variation is
$(a + k)^{-1/2}$, which is smaller than $a^{-1/2}$, the prior coefficient of variation.

Sampling plan (c). Inverse binomial sampling. A unit is put on test until
it fails or reaches a specified age $t_0$, whichever occurs first. At this time, the
unit is replaced by a new unit. This procedure is repeated sequentially until $k$
(specified in advance) failures are actually observed. The number $N$ of units
that have to be tested until $k$ failures are actually observed is, of course, a
random variable. This plan may be used if we have only one test chamber
and we are able to test at most one unit at a time.

Let $Y = \min(X, t_0)$, where $X$ has exponential density $\lambda \exp[-\lambda x]$, $x \ge 0$.
Then, conditional on $Y < t_0$ and $\lambda$, the density of $Y$ is given by

$\lambda \exp[-\lambda y]/(1 - \exp[-\lambda t_0]) \qquad \text{for } 0 \le y < t_0.$

Thus, if the successive failure ages actually observed are denoted by $y_1, \ldots, y_k$,
the corresponding conditional joint density is given by

$g(y_1, \ldots, y_k \mid \lambda) = \prod_{i=1}^{k} \frac{\lambda \exp[-\lambda y_i]}{1 - \exp[-\lambda t_0]} = \frac{\lambda^{k} \exp\left[-\lambda \sum_{i=1}^{k} y_i\right]}{(1 - \exp[-\lambda t_0])^{k}}.$

The probability that $N = n$ units have to be tested in order to observe $k$ actual
failures is given by

$P[N = n \mid \lambda] = \binom{n-1}{k-1} (1 - \exp[-\lambda t_0])^{k} \exp[-\lambda t_0 (n-k)] \qquad \text{for } n \ge k.$

It follows that, given $N = n$ and observed failure ages $y_1, y_2, \ldots, y_k$, the
corresponding likelihood is

$L(\lambda \mid D) = \binom{n-1}{k-1} \lambda^{k} \exp\left[-\lambda \sum_{i=1}^{k} y_i\right] \exp[-\lambda t_0 (n-k)], \qquad 0 \le y_i < t_0,\ i = 1, \ldots, k.$

Combining exponentials we obtain

(2.7)  $L(\lambda \mid D) = \binom{n-1}{k-1} \lambda^{k} \exp\left[-\lambda \left[\sum_{i=1}^{k} y_i + (n-k) t_0\right]\right], \qquad 0 \le y_i < t_0,\ i = 1, \ldots, k.$

From lemma 1.5, we conclude that $(k, T)$, where

$T = \sum_{i=1}^{k} y_i + (n-k) t_0,$

is sufficient for $\lambda$, since $k$ is fixed in advance.

From (2.7) we also obtain the posterior density for $\tilde{\lambda}$ based on a gamma
prior density as

(2.8)  $\pi(\lambda \mid D) = (b + T)^{k+a} \lambda^{k+a-1} \exp[-\lambda (b + T)]/\Gamma(k + a).$

Note that the posterior densities are identical under the three sampling plans
so far considered! (Compare (2.8), (2.6) and (2.3).)
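As a sanity check on plan (c), the distribution of the number $N$ of units tested can be summed over its range. The sketch assumes the negative binomial form with coefficient $\binom{n-1}{k-1}$, the same coefficient that appears in (2.7); the values of the failure rate, the age limit $t_0$ and $k$ are hypothetical, as is the function name.

```python
import math

def prob_units_tested(n, k, lam, t0):
    """P[N = n | lam]: the k-th failure before age t0 occurs on the n-th unit."""
    p_fail = 1 - math.exp(-lam * t0)     # a unit fails before reaching age t0
    return math.comb(n - 1, k - 1) * p_fail**k * (1 - p_fail)**(n - k)

lam, t0, k = 0.01, 50.0, 3               # hypothetical rate, age limit, failures

# The probabilities over n = k, k + 1, ... should add up to one; the tail
# beyond n = 400 is negligible for these values.
total = sum(prob_units_tested(n, k, lam, t0) for n in range(k, 400))

# Expected number of units tested for a negative binomial: k / p_fail.
expected_n = sum(n * prob_units_tested(n, k, lam, t0) for n in range(k, 400))
```

The factor $(1 - \exp[-\lambda t_0])^{k}$ in this probability cancels against the denominator of the conditional density of the failure ages, which is why it does not appear in (2.7).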

Remarks. In the three sampling plans considered so far, the number $k$ of
observed failures and the total time on test $T$ are all that we need from the
observations in order to complete our data analysis; the sufficiency of $k$ and $T$
makes this true. Note that this fact holds for any choice of a prior density.

The « stopping rule » in each of the three test plans considered gives no
information about the parameter $\lambda$, i.e. is noninformative about $\lambda$. (As the
name implies, the stopping rule is simply the rule for determining when testing
is to stop. The stopping rule is not necessarily the same as the stopping time.
For example, in the sampling plan (a) we test until $k$ failures are observed and
then stop further observation.) If the stopping rule were to give information
about the parameter $\lambda$, then the total time on test $T$ and the observed number
of failures would not be sufficient for $\lambda$.

Finally, note that in the test plans considered thus far, the MLE is the ratio
of the number of observed failures to the total time on test. This simple
formula for the MLE holds in most of the testing procedures followed under the
exponential model.


3.  -  Inference  based  on  mean  life. 

Thus far we have discussed inference for the exponential distribution based
on the failure rate $\lambda$. For many analysts, the mean life $\theta$ of the exponential
distribution may seem to be the more appropriate parameter to estimate.
Note that either parameter determines the other in the exponential model since

$\theta = \int_0^{\infty} x \lambda \exp[-\lambda x]\,dx = \lambda^{-1}.$

Suppose we consider the simplest type of testing plan: $n$ units are put on
test and observed until each fails. The corresponding mutually independent
lifelengths given $\lambda$ are $x_1, x_2, \ldots, x_n$, constituting a complete sample from the
exponential density

$f(x \mid \theta) = \theta^{-1} \exp[-x/\theta].$

The likelihood of the outcome is given by

(3.1)  $L(\theta \mid x_1, \ldots, x_n) = \theta^{-n} \exp\left[-\theta^{-1} \sum_{i=1}^{n} x_i\right].$
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Clearly,  the  MLE  of  0  is  given  by 


Q  =  n  1  ^  xi  • 

i 

(Note  that  Q  =  (A)-1  (see  (1.2)).) 

Suppose now we have very little prior information on the parameter $\theta$.
We, therefore, assume a rectangular prior density on $\theta$:

    $\pi(\theta) = M^{-1}$ for $0 \le \theta \le M$,

where M is large (*). The corresponding posterior density for $\theta$ may be computed
approximately, by Bayes' theorem (theorem 1.2), as

(3.2)    $\pi(\theta \mid x_1, \ldots, x_n) = b^a \theta^{-(a+1)} \exp[-b/\theta] / \Gamma(a)$

for $\theta, a, b > 0$, where now $a = n - 1$ and $b = \sum_{i=1}^{n} x_i$.

The density of (3.2), denoted by $\pi_{a,b}(\theta)$, is called the inverted gamma density,
since, if $\theta$ is a random variable with density $\pi_{a,b}(\theta)$, then $\theta^{-1} = \lambda$ has gamma
density (1.5).
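The step from the rectangular prior to the inverted gamma posterior can be spelled out as follows. This is a sketch of the standard calculation, assuming $n \ge 2$ and M large enough that the integral over $(0, M)$ is close to the integral over $(0, \infty)$; matching the exponent of $\theta$ in the last line with $-(a+1)$ gives $a = n - 1$.

```latex
\begin{align*}
\pi(\theta \mid x_1, \ldots, x_n)
  &\propto M^{-1}\, \theta^{-n} \exp[-b/\theta],
     \qquad 0 < \theta \le M, \quad b = \sum_{i=1}^{n} x_i, \\
\int_0^{M} \theta^{-n} \exp[-b/\theta]\, d\theta
  &\approx \int_0^{\infty} u^{n-2} e^{-bu}\, du
   = \frac{\Gamma(n-1)}{b^{\,n-1}}
     \qquad (u = 1/\theta), \\
\pi(\theta \mid x_1, \ldots, x_n)
  &\approx \frac{b^{\,n-1}\, \theta^{-n} \exp[-b/\theta]}{\Gamma(n-1)}.
\end{align*}
```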

We may verify readily that, if our initial prior is of the form $\pi_{a,b}(\theta)$ of (3.2)
and we use any one of the sampling plans, then the corresponding posterior
density is also of the form (3.2). However, the parameter a of the prior is
replaced by a + k in the posterior, and the parameter b of the prior is replaced
by b + T in the posterior. As before, k denotes the number of observed failures
and T is the total time on test. This follows readily from the likelihood
expression (3.1) and the fact that in this case $T = \sum_{i=1}^{n} x_i$, since all lifelengths are
observed.
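The updating rule just described (a becomes a + k, b becomes b + T) can be sketched in a few lines of Python. The class and function names are hypothetical, and the numerical values are illustrative only. The sketch also shows that updating in two stages gives the same posterior as updating once with the pooled data, which is what one would expect from a conjugate family.

```python
from typing import NamedTuple

class InvertedGamma(NamedTuple):
    """Parameters (a, b) of an inverted gamma density on the mean life."""
    a: float
    b: float

def update(prior: InvertedGamma, k: int, T: float) -> InvertedGamma:
    """Posterior parameters after k observed failures in total time on test T."""
    if k < 0 or T < 0:
        raise ValueError("k and T must be nonnegative")
    return InvertedGamma(prior.a + k, prior.b + T)

prior = InvertedGamma(a=4, b=12)   # illustrative prior parameters

# Two successive updates versus one update with the pooled data.
staged = update(update(prior, k=1, T=10), k=1, T=18)
pooled = update(prior, k=2, T=28)
```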

The mean of the posterior density. The mean of the inverted gamma density
given in (3.2) is readily calculated:

    $\int_0^{\infty} \theta \{ b^a \theta^{-(a+1)} \exp[-b/\theta] / \Gamma(a) \}\, d\theta = b/(a - 1).$
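The value $b/(a - 1)$ follows from the substitution $u = b/\theta$. The short derivation below is a sketch of that standard step; it also makes explicit the condition $a > 1$ under which the mean exists.

```latex
\begin{align*}
\int_0^{\infty} \theta\, \frac{b^a \theta^{-(a+1)} \exp[-b/\theta]}{\Gamma(a)}\, d\theta
  &= \frac{b^a}{\Gamma(a)} \int_0^{\infty} \theta^{-a} \exp[-b/\theta]\, d\theta \\
  &= \frac{b^a}{\Gamma(a)} \int_0^{\infty} (b/u)^{-a}\, e^{-u}\, \frac{b}{u^2}\, du
     \qquad (u = b/\theta) \\
  &= \frac{b}{\Gamma(a)} \int_0^{\infty} u^{a-2} e^{-u}\, du
   = \frac{b\, \Gamma(a-1)}{\Gamma(a)}
   = \frac{b}{a-1}, \qquad a > 1.
\end{align*}
```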

(*) It would be inconsistent mathematically to assume a rectangular prior on both $\lambda$
and $\theta = \lambda^{-1}$, even though we have very little prior information on both parameters.
We assume a rectangular prior for $\theta$ here to motivate the use of a natural conjugate
prior.

For the inverted gamma posterior density in which the parameters are a + k
(in place of a) and b + T (in place of b), the corresponding mean takes the form

(3.3)    $\frac{b + T}{a + k - 1} = (1 - w) \frac{b}{a - 1} + w \frac{T}{k},$

where $w = k/(k + a - 1)$. Thus the mean of the posterior density may be written
as a convex combination of the prior mean $b/(a - 1)$ and the maximum-likelihood
estimate $T/k$ of the exponential life distribution mean. Note that, as k, the number
of observed failures, increases, the posterior mean attaches more weight to the
MLE of the true mean and less weight to the prior mean.
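The convex-combination form of the posterior mean can be checked numerically. The sketch below, with hypothetical function names and illustrative parameter values, evaluates the posterior mean directly and as a weighted average of the prior mean and the MLE, and lists the weight placed on the MLE for a few values of k.

```python
def posterior_mean(a, b, k, T):
    """Mean of the inverted gamma posterior with parameters a + k and b + T."""
    return (b + T) / (a + k - 1)

def weight_on_mle(a, k):
    """Weight w = k / (k + a - 1) attached to the MLE T/k."""
    return k / (k + a - 1)

def credibility_mean(a, b, k, T):
    """Weighted average of prior mean and MLE; requires a > 1 and k > 0."""
    w = weight_on_mle(a, k)
    prior_mean = b / (a - 1)
    mle = T / k
    return (1 - w) * prior_mean + w * mle

a0, b0 = 4, 12        # illustrative prior parameters
k_obs, T_obs = 2, 28  # illustrative data: 2 failures, total time on test 28

direct = posterior_mean(a0, b0, k_obs, T_obs)
weighted = credibility_mean(a0, b0, k_obs, T_obs)

# The weight on the MLE grows toward 1 as the number of failures increases.
weights = [weight_on_mle(a0, kk) for kk in (1, 10, 100)]
```

With these values the prior mean is 4 and the MLE is 14; both forms give a posterior mean of 8, up to floating-point rounding.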

Table  3.1  summarizes  the  properties  of  the  natural  conjugate  prior  density 
and  of  the  corresponding  posterior  density  for  the  two  different  parametrizations 
of  the  exponential  model. 


Table 3.1. - Comparison of alternative parametrizations of the exponential model.

Parameter                    Failure rate, $\lambda$                               Mean life, $\theta$

likelihood                   $\lambda^k \exp[-\lambda T]$                          $\theta^{-k} \exp[-T/\theta]$

natural conjugate prior      $b^a \lambda^{a-1} \exp[-b\lambda]/\Gamma(a)$,         $b^a \theta^{-(a+1)} \exp[-b/\theta]/\Gamma(a)$,
                             $a, b > 0$ (gamma)                                    $a, b > 0$ (inverted gamma)

prior mode                   $(a - 1)/b$                                           $b/(a + 1)$

prior mean                   $a/b$                                                 $b/(a - 1)$, $a > 1$

prior variance               $a/b^2$                                               $b^2/[(a - 1)^2 (a - 2)]$, $a > 2$

prior coefficient            $a^{-1/2}$                                            $(a - 2)^{-1/2}$, $a > 2$
of variation

posterior mean               $(a + k)/(b + T)$                                     $(b + T)/(a + k - 1)$, $a > 1$

posterior variance           $(a + k)/(b + T)^2$                                   $(b + T)^2/[(a + k - 1)^2 (a + k - 2)]$, $a > 2$

posterior coefficient        $(a + k)^{-1/2}$                                      $(a + k - 2)^{-1/2}$, $a > 2$
of variation

Note: k = number of observed failures, T = total time on test.


Arbitrary prior density. In our analysis up to now, we have focused mostly
on the case in which the prior density is the natural conjugate prior. In this
subsection, we expand our consideration to cases in which the prior is not
necessarily the natural conjugate prior. We obtain results similar to those holding
in the natural-conjugate-prior case.

Let $\pi(\theta)$ be a prior density on $\theta$ such that $\{\theta \mid \pi(\theta) > 0\}$ is an interval on
$[0, \infty)$. We show in the appendix that

(3.4)    $P[\theta > \theta_0 \mid k, T] = \int_{\theta_0}^{\infty} \pi(\theta \mid k, T)\, d\theta$

is decreasing (*) in $k \ge 0$ for fixed T and increasing (*) in T for fixed k, i.e. the
posterior random variable $\theta$ is stochastically decreasing in k and stochastically
increasing in T. In particular,

    $E[\theta \mid k = 0, T] \ge E[\theta \mid k = 0, T = 0];$

the lower bound is, of course, the mean of the prior density. Thus observing
total time on test without observing failures tends to change our belief about $\theta$
as compared with our prior belief; we tend to believe in a larger $\theta$. However,
for the natural conjugate prior, the variance of $\theta$ given k and T decreases with k
but increases with T. Also the coefficient of variation is constant in T but
decreases with k (see table 3.1). Hence, if k = 0, large values of T tend to make
us optimistic regarding $\theta$. However, failures are needed to sharpen the posterior
density.

(*) Terminology: Throughout we use the term « increasing » in place of « nondecreasing »
and « decreasing » in place of « nonincreasing ».

Fig. 3.1. - Posterior mean, posterior standard deviation and posterior coefficient of
variation as a function of elapsed test time.

Similarly, for $\lambda = \theta^{-1}$, we can show that for a general prior density
on $\lambda$ under mild regularity conditions

    $E[\lambda \mid k, T]$

increases in k for fixed T and decreases in T for fixed k, just as we would expect.
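The stochastic monotonicity of the posterior for the mean life in k and T can be illustrated numerically for a prior that is not the natural conjugate prior. The sketch below is only an illustration under stated assumptions: the triangular prior on (0, 20), the midpoint grid, and all names are hypothetical, and the likelihood is taken proportional to $\theta^{-k} \exp[-T/\theta]$ as in table 3.1.

```python
import math

def prior_density(theta):
    """Hypothetical non-conjugate prior: triangular on (0, 20), peak at 5 (unnormalized)."""
    if theta <= 0 or theta >= 20:
        return 0.0
    return theta / 5 if theta <= 5 else (20 - theta) / 15

def tail_probability(k, T, theta0, n_grid=4000, upper=20.0):
    """Approximate P[theta > theta0 | k, T] on a uniform midpoint grid."""
    h = upper / n_grid
    grid = [(i + 0.5) * h for i in range(n_grid)]
    # Unnormalized posterior: prior times likelihood theta^(-k) * exp(-T/theta).
    post = [prior_density(th) * th ** (-k) * math.exp(-T / th) for th in grid]
    tail = sum(p for th, p in zip(grid, post) if th > theta0)
    # The grid spacing cancels in the ratio, so no explicit factor h is needed.
    return tail / sum(post)

# Tail probability as k grows with T fixed, and as T grows with k fixed.
in_k = [tail_probability(k, T=10.0, theta0=5.0) for k in range(4)]
in_T = [tail_probability(k=2, T=T, theta0=5.0) for T in (5.0, 10.0, 20.0, 40.0)]
```

Under these assumptions the entries of `in_k` should decrease and those of `in_T` should increase, in line with the statement about (3.4).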

3.1. Example. - Suppose our prior density on $\theta$ is the natural conjugate
prior with a = 4 and b = 12. We put 10 units on test. The earliest failure
occurs at $X_{(1)} = 1$, followed by a second failure at $X_{(2)} = 3$. In fig. 3.1, we
plot the posterior mean, posterior standard deviation and posterior coefficient
of variation as a function of t, the test time elapsed. Table 3.1 may be used
to generate the plots. Note that failures cause vertical drops in the graphs.

In fig. 3.2 we have plotted the posterior density for $\theta$ at selected times
during the life test. The posterior density for t = 0 is, of course, the prior
density. Notice the shape of the posterior density at $t = 1^-$ (i.e. just before
the first observed failure) and at t = 1 (i.e. just after the first failure).
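The quantities plotted in fig. 3.1 can be reproduced approximately with a short Python sketch based on the inverted gamma formulas of table 3.1. It assumes, beyond what the example states, that failed units are not replaced and that no failure other than the two listed occurs in the time range considered; the function and variable names are hypothetical.

```python
import math

A0, B0 = 4, 12               # prior parameters from the example
N_UNITS = 10                 # units put on test
FAILURE_TIMES = [1.0, 3.0]   # the two observed failure times

def posterior_summary(t, just_before=False):
    """Posterior mean, standard deviation and coefficient of variation at time t.

    Assumes no replacement of failed units and no failures other than
    those in FAILURE_TIMES up to time t.
    """
    if just_before:
        failed = [x for x in FAILURE_TIMES if x < t]
    else:
        failed = [x for x in FAILURE_TIMES if x <= t]
    k = len(failed)
    T = sum(failed) + (N_UNITS - k) * t   # total time on test at time t
    a, b = A0 + k, B0 + T                 # posterior parameters
    mean = b / (a - 1)
    sd = b / ((a - 1) * math.sqrt(a - 2))
    cv = 1 / math.sqrt(a - 2)
    return mean, sd, cv

at_start = posterior_summary(0.0)
before_first = posterior_summary(1.0, just_before=True)
after_first = posterior_summary(1.0)
after_second = posterior_summary(3.0)
```

Under these assumptions the posterior mean rises from 4 at t = 0 to 22/3 just before the first failure and drops to 5.5 at the failure; immediately after the second failure the posterior mean is 8, the standard deviation 4 and the coefficient of variation 0.5.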
4. - Notes and references.

Section 1. Books which emphasize Bayesian concepts as well as their
applications are [5, 6]. In a series of papers, Basu [4, 7, 8] points out the
inadequacy of other approaches to statistical inference. Definition 1.4 of sufficiency
is attributed by Basu to Kolmogorov [3]. The connection between sample
theory (Fisher) sufficiency and Kolmogorov sufficiency is discussed in [10].
Fisher sufficiency implies Kolmogorov sufficiency. The converse is false in
general. Natural conjugate priors were introduced by Raiffa and
Schlaifer [11]. For a rigorous characterization of natural conjugate priors,
see [12]. The reviews of Bayesian statistics by Lindley [3, 13] present
excellent summaries of recent advances in the subject.

Section  2.  Epstein  and  Sobel  [14, 15]  were  the  first  to  investigate  the 
properties  of  the  exponential  model  applied  to  life  test  plans.  In  a  series  of 
papers  they  made  an  intensive  and  extensive  study  of  the  statistical  features 
of  a  variety  of  exponential  procedures  for  life  testing.  Their  work  greatly 
influenced  reliability  theory  and  practice  at  the  time  and  still  strongly  influences 
current  statistical  practice  in  reliability.  Government  Handbook  H-108  and 
its  subsequent  modifications  represent  the  Government’s  « seal  of  approval » 
and  its  effort  to  implement  the  theory  by  making  readily  available  tables  and 
graphs  for  easy  use  of  the  exponential  model  [16]. 

Section 3. Stochastic monotonicity properties of the posterior mean are
derived using the concepts and methods of total positivity. (A comprehensive
and authoritative treatment of total positivity may be found in [17].)
Theorem A.1 in the appendix is similar to lemma 1, p. 276, of Karlin and Rubin [18].

In the mathematical insurance literature formula (3.3) is called the
credibility formula. Jewell [19, 20] has discussed Bayesian life testing.

*  *  * 

This  research  was  supported  by  the  Air  Force  Office  of  Scientific  Research 
(AFSC),  USAF,  under  Grant  AFOSR-77-3179  with  the  University  of  California. 
Reproduction in whole or in part is permitted for any purpose of the United

Appendix

Posterior  distributions  corresponding  to  arbitrary  priors  and  likelihoods  with 
the  monotone  ratio  property. 


We will prove the results mentioned in sect. 3.

A.1 Theorem. Let $\pi(\theta)$ be a prior density on $\Theta$. Let data $D = (k, T)$ and
likelihood $L(\theta \mid k, T)$ be given such that for $\theta_1 < \theta_2$,

    $L(\theta_1 \mid k, T) / L(\theta_2 \mid k, T)$

is increasing in k and decreasing in T. Let $g(\theta)$ be increasing in $\theta$. Then

(A.1)    $\int_{\Theta} g(\theta)\, \pi(\theta \mid k, T)\, d\theta$

is decreasing in k (T fixed) and increasing in T (k fixed). The proof will be
given shortly.

A.2 Corollary. If $L(\theta \mid k, T) \propto \theta^{-k} \exp[-T/\theta]$, then

(A.2)    $P[\theta > \theta_0 \mid k, T] = \int_{\theta_0}^{\infty} \pi(\theta \mid k, T)\, d\theta$

is decreasing in k (T fixed) and increasing in T (k fixed). (See sect. 3 and (3.4).)

Proof. In theorem A.1, let $g(\theta) = 1$ for $\theta > \theta_0$ and 0 otherwise. ||

A.3 Corollary. Let $L(\lambda \mid k, T) \propto \lambda^k \exp[-\lambda T]$ and $g(\lambda)$ be increasing in $\lambda$.
Then

(A.3)    $\int_0^{\infty} g(\lambda)\, \pi(\lambda \mid k, T)\, d\lambda$

is increasing in k (T fixed) and decreasing in T (k fixed).

Clearly (A.1) ((A.3)) implies that all moments of the posterior distribution
are decreasing (increasing) in k and increasing (decreasing) in T.

Proof of Theorem A.1. For $\theta_1 < \theta_2$,

    $L(\theta_1 \mid k, T) / L(\theta_2 \mid k, T)$

increasing in k implies that for $k_1 < k_2$

    $L(\theta \mid k_1, T) / L(\theta \mid k_2, T)$

is increasing in $\theta$. It is easy to see that this in turn implies that

(A.4)    $\pi(\theta \mid k_1, T) / \pi(\theta \mid k_2, T)$

is also increasing in $\theta$. Let

    $A = \{\theta \mid \pi(\theta \mid k_1, T) > \pi(\theta \mid k_2, T)\}$

and

    $B = \{\theta \mid \pi(\theta \mid k_1, T) < \pi(\theta \mid k_2, T)\}.$

Let $a = \inf_{\theta \in A} g(\theta)$ and $b = \sup_{\theta \in B} g(\theta)$. Then the fact that the ratio in (A.4) is
increasing in $\theta$ implies that $a \ge b$.

Now

    $\int_{\Theta} g(\theta) [\pi(\theta \mid k_1, T) - \pi(\theta \mid k_2, T)]\, d\theta$
        $\ge a \int_A [\pi(\theta \mid k_1, T) - \pi(\theta \mid k_2, T)]\, d\theta + b \int_B [\pi(\theta \mid k_1, T) - \pi(\theta \mid k_2, T)]\, d\theta$
        $= (a - b) \int_A [\pi(\theta \mid k_1, T) - \pi(\theta \mid k_2, T)]\, d\theta \ge 0.$

Hence, the integral in (A.1) is decreasing in k.

A similar argument proves that the integral in (A.1) is increasing in T. ||